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THE INFINITE ARNOLDI EXPONENTIAL INTEGRATOR FOR 
LINEAR INHOMOGENEOUS ODES 

ANTTI KOSKELA, ELIAS JARLEBRING * 

Abstract. Exponential integrators that use Krylov approximations of matrix functions have 
turned out to be efficient for the time-integration of certain ordinary differential equations (ODEs). 
This holds in particular for linear homogeneous ODEs, where the exponential integrator is equivalent 
to approximating the product of the matrix exponential and a vector. In this paper, we consider linear 
inhomogeneous ODEs, y'(t) = Ay(t) + g(t), where the function g(t) is assumed to satisfy certain reg¬ 
ularity conditions. We derive an algorithm for this problem which is equivalent to approximating the 
product of the matrix exponential and a vector using Arnoldi’s method. The construction is based on 
expressing the function g(t ) as a linear combination of given basis functions [</>*] £2 0 with particular 
properties. The properties are such that the inhomogeneous ODE can be restated as an infinite¬ 
dimensional linear homogeneous ODE. Moreover, the linear homogeneous infinite-dimensional ODE 
has properties that directly allow us to extend a Krylov method for finite-dimensional linear ODEs. 
Although the construction is based on an infinite-dimensional operator, the algorithm can be carried 
out with operations involving matrices and vectors of finite size. This type of construction resembles 
in many ways the infinite Arnoldi method for nonlinear eigenvalue problems m- We prove conver¬ 
gence of the algorithm under certain natural conditions, and illustrate properties of the algorithm 
with examples stemming from the discretization of partial differential equations. 
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equations, Bessel functions 
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1. Introduction. Consider a matrix A £ C nxn and a function g : C —> C n 
with elements which are entire functions. We consider the problem of numerically 
computing the time-evolution of the linear ordinary differential equation with an 
inhomogeneous term 


u (t) = Au(t) + g(t ), u( 0) = uq. 


( 1 . 1 ) 


Our focus will be on equations that arise from spatial semidiscretization of partial 
differential equations of evolutionary type, and A will typically be a large sparse 
matrix, and g will neither be close to linear nor correspond to an extremely stiff 
nonlinearity, in a sense which is further explained in the examples in Section [5] 

The general problem of computing the time-evolution of ODEs can be approached 
with various numerical methods. The method we will present in this paper belongs 
to the class of methods called exponential integrators. Exponential integrators have 
recently received considerable interest; see the review paper Em An attractive feature 
of these methods stems from the combination of approximation of matrix functions 
and the use of Krylov methods |13j . This is mostly due to the superlinear convergence 
of the Krylov approximation of entire matrix functions [12] . 

In this paper we will present a new exponential integrator for m- The integrator 
is constructed using a particular type of expansion of the function g in (11.11) . We will 
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consider expansions of the type 


d(s ) = '^2w i <l>t(s), 
£=0 


( 1 . 2 ) 


where £ C", < £ N, and the basis functions (fro, (fri,■■ ■ are assumed to satisfy 


'M*y 


'Mt)~ 


> o(0)' 

Mt) 

= H 00 

Mt) 




(1.3a) 


where Hqq £ R“ xo ° is an infinite-dimensional Hessenberg matrix, satisfying for a 
fixed constant C > 0, 


\\Hn\\ < C for all iV = 0,..., oo. (1.3b) 

The matrix Hn £ R NxN is the leading submatrix of H^. 

The scaled monomials is the easiest example of such a sequence of functions. If we 
define (fri(t) := t e /£\, i = 0,..., then (11.31) is satisfied with given by a transposed 
Jordan matrix 


H n 


0 

1 0 
1 0 


(1.4) 


In this case, the expansion (EH corresponds to a Taylor expansion and the coefficients 
are given by W( = gW (0), £ = 0,.... We will also see that these properties are satisfied 
for other functions, e.g., the Bessel function and the modified Bessel function of the 
first kind (as we will further explain in Section m - The algorithm will be derived 
and analyzed for these choices of [<fri]'?l 0 . The choice of basis functions can be tailored 
for the problem, and the best choice is problem dependent. This will be illustrated 
in the numerical examples in Section [5] 

The general idea of our approach can be seen as follows. If (fro, <fri, ■ ■ • are the 
scaled monomials, then we can truncate (11.21) at i = TV, yielding y’{t) = Ay(t) + 
Y^e=o w t ( t ) £(f) an d ^ straightforward to verify that the inhomogeneous ODE (11.11) 
can be expressed as a larger linear homogeneous ODE, 



y(t) 


y(t) 


2/(0) 



d 

Mt) 

= A n 

Mt) 


M0) 


2/o 

dt 

i 

1 . . . 

c~+- 

1_ 

i 

-e- 

i ... 

c~+- 

1_ 


<fr N - 1(0)_ 


ei 



where we have defined 


A n := 


A 

0 


W N 

h n 


( 1 . 6 ) 


and Hn £ K JVxAr is the leading TV x TV block of T?oo and Wn ■= [u>o ■ ■ ■ w/v-i] € 
C nxJv . This relation has been used in J2J Theorem 2.1] and also in m- If we combine 
this type of construction with an iterative method (in a particular way), we will here 
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be able to construct an algorithm for m for any sequence of functions </>q, <pi,... 
satisfying m- 

The construction (USD and the matrix (USD resemble in some ways the tech¬ 
nique called companion linearization used for polynomial eigenvalue problems; see 
e.g. mm- The algorithm known as the infinite Arnoldi method m is an algo¬ 
rithm for nonlinear eigenvalue problems (not necessarily polynomial). One variant of 
the infinite Arnoldi method can be interpreted as the Arnoldi method !20] applied 
to the companion linearization of a truncated Taylor expansion. Due to a particular 
structure of the companion matrix, the infinite Arnoldi method is also equivalent to 
the application of the Arnoldi method on an infinite-dimensional companion matrix. 
This equivalence is consistent with the observation that many attractive features of 
the Arnoldi method appear to be present also in the infinite Arnoldi method. 

We will in this paper illustrate that the underlying techniques used to derive 
the infinite Arnoldi method can also be applied to (O). Similar to the infinite 
Arnoldi method, the presented algorithm can be interpreted as an exponential inte¬ 
grator applied to a truncated problem, as well as the integrator applied to an infinite¬ 
dimensional problem. An important feature of this construction is that the algorithm 
does not require a choice of a truncation parameter in the expansion CE2D, making it 
in a sense applicable to arbitrary nonlinearities. 

The paper is structured as follows. The infinite-dimensional properties of the 
algorithm are derived in Section 12.11 Although the construction in Section 12.11 is 
general for essentially arbitrary basis, the convergence proofs are basis dependent. 
We show that the algorithm converges for several bases (scaled monomials, Bessel 
functions and modified Bessel functions) under certain conditions, i.e., the truncation 
of (11.21) converges and the derivatives g^\ 0) of the nonlinearity are bounded with 
respect to the linear operator A in a certain way. This convergence theory is presented 
in Section[4] We illustrate the properties of the algorithm and its variants in Section[5] 
including comparisons with other algorithms. 

We will mostly use standard notation. denotes the element at the *th 

row and jth column of Hjy. Analogously, the colon notation will be used to denote 
entire rows and columns, e.g., 14 ,: corresponds to the vector in the kth row of the 
matrix V. We will also extensively use infinite-dimensional matrices. More precisely, 
we will work with sequences of matrices W/v £ M. nxN , N = 0,..., which are nested, 
i.e., Wn-i £ R" x ( Ar_1 ) are the first N — 1 columns of W/v, and Woo will be the 
corresponding infinite-dimensional matrix. We will also consider sequences of square 
matrices H jy £ R NxN , where H^-i is the leading submatrix of Hm- The infinite¬ 
dimensional operator associated with the limit will be denoted Hoo £ M ooxo °. We 
will use e, to denote the ith unit vector of consistent size. Throughout the paper, 
|| • || denotes the Euclidean vector norm or the spectral matrix norm, unless otherwise 
stated. 

2. Preliminaries. 

2.1. Infinite-dimensional reformulation. At first we will show that the in¬ 
homogeneous ODE m is equivalent to an infinite-dimensional homogeneous ODE. 
The reformulation is illustrated in the following lemma and can be interpreted as an 
analogous transformation illustrated for monomials and truncated Taylor expansion 
in m and m, but without truncation and for arbitrary basis functions satisfying 
0 . 
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Lemma 1 (Infinite-dimensional reformulation) Consider the initial value prob¬ 
lem m, and a sequence of basis functions which satisfy (11.31) . Moreover, sup¬ 

pose that the function g in (O) can be expanded as & and let Woo = [rco, w ±, u> 2 , - - • ] G 
C” x °° denote the expansion coefficients. 

(a) Suppose u(t) is a solution to (11.11) . Then 



u 





u 


'u{ 0)' 



d 

4>o 


A 

Woo' 


4> 0 


</>o(0) 


u 0 

dt 



0 

Hoo 






ei 












(b) Suppose v(t) satisfies 


At) 


A 

0 


Woo 

Hoo 


v(t), 



( 2 . 1 ) 


( 2 . 2 ) 


Then the function u(t) := [l n 0] v(t) is the unique solution to m- 

Proof. The equation m is easily verified by considering the individual blocks. 
The first n rows of (12.11) satisfy u'[t) = Au{t) + JfeLo W^(t) = Au(t) + g(t). Rows 
n + 1, n + 2,... are precisely the conditions in (ll.3al) . In order to show (12.21) . first 
note that the rows n + 1 , n + 2, ... in (12.21) reduce to the equation 


d 

V n +1 (t) 


V n +l{t) 


v n+ i(0) 

dt 


= Hoo 
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Since the operator Hoo has a finite norm by assumption (11.3bl) . it follows from the 
Picard-Lindelof theorem that there exists a unique solution. This solution is the 
sequence of basis functions ..., since they satisfy this ODE by assumption, 

i.e., Un+i +i = 4>i for all i £ N. The conclusion follows by substituting v(t) into the 
first n rows in m • □ 


2.2. Characterization of basis functions <j>£. As mentioned in the introduc¬ 
tion (in particular in formula (11.41) 1. it is straightforward to verify that the scaled 
monomials satisfy the condition m required for the basis functions. Although the 
algorithm described in the following section applies for any basis functions satisfying 
m, we concentrate the discussion on specialized results for two additional types 
of functions. We will now show that the Bessel functions and the modified Bessel 
functions of the first kind satisfy m- 

The Bessel functions of the first kind are defined by (see e.g. p]), Jt{t) := 
- Jq cos (£t — t sin(r)) dr, for <eN, and they satisfy 

m = \vt~i(t) - j i+1 (t)) 

J-e(t) = (-lYMt), £>0 
1 if £ = () 

0 otherwise 

Let J N (t) = [J 0 (t) J\(t) ... JAr_i(t)] T G R N , i.e., a vector of Bessel functions 



(2.3a) 

(2.3b) 

(2.3c) 
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with non-negative index. Moreover, let 


0 

1 

2 


-1 

0 


h n — 


1 

2 



E C NxN 


(2-4) 


From the relations (12.31) . we easily verify that the Bessel functions of the first kind are 
solutions to the infinite-dimensional ODE of the form m, with Hn given by (EH). 
More precisely, 


J'ooif) = H ooJoo(t), Joo(0) = e 1 . 

With similar reasoning we can establish an ODE (11.31) also for the modified Bessel 
functions of the first kind, which are defined by 


hit) := (-i)"J„(it). (2.5) 

and satisfy I' e (t) = \ih-i (t) + h+i (f)), <el These properties lead to the infinite¬ 
dimensional ODE 


T' 00 (t) = H 00 I 00 (t), /(0) = ei , 

where In( t) = [lo(t) I\ (f) ... /;v_i(t)] T and 


Hn = 


0 1 


G C 


NxN 


( 2 . 6 ) 


Therefore, we can show that the Bessel functions and the modified Bessel functions 
of the first kind satisfy m, with an explicitly given constant C. 


Lemma 2 (Basis functions) The conditions for the basis functions in OD are 
satisfied with C = 2 for, 

(a) scaled monomials, i.e., <f>i(t) =t\/i\, with Hoo defined by (11.41) : 

(b) Bessel functions, i.e., <fi(t) = Ji{t), with defined by (EH); and 

(c) modified Bessel functions, i.e., 4>i(t) = lift), with H 0a defined by (12.61) . 

Proof. Statement (a) follows from the definition. The conditions (11.3 a I) have been 
shown already for (b) and (c), since they follow directly from (12.31) and (12.51) . It 
remains to show that the uniform bound (ll.3bl) is satisfied for (b) and (c). Note that 
in both cases (b) and (c) we can express Hn as Hn = Tn + En, where En = 
and Tn is a (finite) band Toeplitz matrix, for any N = 2,... ,oo. We now invoke a 
general result for (finite) band Toeplitz matrices [5j Theorem 1.1] which implies that 
||Tjv|| < UTooll = 1. Hence, \\H N \\ < ||T^|| + H^H = 3/2, and (lL3bl) holds with 
C= 2. □ 
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2.3. Characterization of expansion coefficients we- In principle, the algo¬ 
rithm will be applicable to any problem for which there is a convergent expansion of 
the form El with some coefficients we £ C n , f g N. In practice these coefficients 
may not be explicitly available. We will now characterize a relationship between the 
coefficients and the derivatives of g. This will be necessary in the theoretical conver¬ 
gence analysis in Section H and also useful in numerical evaluation of the coefficients 
(Section d). 

Assume that an expansion of the form El exists and let 
W N = [w 0 wx ... rujv-i] • 

By considering the Zth derivative of g{t) and using the properties of basis functions 
El we have that 

g W(0) = W oo H i oo e 1 =W N H e N e 1 for all £ < N. 

In the last equality we used the fact that H aa is a Hessenberg matrix, and that all 
elements of H^e i except the first l + 1 elements will be zero. The non-zero elements 
will also be equal to H^e\. We now define the upper-triangular matrix 

Kn{H n , ei) = [ei H N ei ... H^~ 1 e 1 ] , (2.7) 

and the matrix Gn as 

G N =[g( 0) g'{ 0) ... ^- 1] (0)] ■ (2.8) 

From the definition it follows that 

W N =G N K N {H N ,e 1 )~ 1 for all N> 1, (2.9) 

under the condition that Abv(-ffjv, ei) is invertible. In a generic situation, the relation 
El can be directly used to compute the coefficients we, £ £ N, given the derivatives 
of g(t). For the Bessel functions and the modified Bessel functions of the first kind, we 
can characterize the coefficients with a more explicit (and more numerically robust) 
formula involving the monomial coefficients of the Chebyshev polynomials of the first 
kind. 

Lemma 3 Let T k< e be the monomial coefficients of the kth Chebyshev polynomial, 
i.e., T k (x ) = 0 Tk/x e - 

(a) For scaled monomials, i.e., 4>k(t) = t\/k\, the expansion coefficients are given 
by iVk = g^ k \ 0), for k £N. 

(v) For the Bessel functions of the first kind, i.e., 4>i{t) = Je(t), the expansion 
coefficients we are given by, 

k 

w o =g(0), w k = 2j2(-l) e T k /g W (0), k = 1,... . 
e=o 

(c) For the modified Bessel functions of the first kind, i.e., 4>effi) = left), the 
expansion coefficients we are given by, 

k 

wo=g(0), w k = 2 y^rfc,iff w (o), k = l,.... 

1 =0 
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Proof. Case (a) follows from the definition. Consider case (c) with the modified 
Bessel functions, i.e., let H N be given by (EH)- The proof is based on showing that 


K N (H N ,e i)" 1 =2 


h T o,o 

0 


Ti, 0 

Tip 


Tn - 1,0 
Tn- i,i 




( 2 . 10 ) 


from which the conclusion follows directly from (12.91) and the fact that Tk,k = 2 k ~ 1 
for any k > 0. 

We will first prove (12.101) for columns k = 2,3,..., TV. From (12.71) and (12.61) we 
directly identify that 


Kn+i{Hn+i, ei) 


K N (H N ,e r) Hga 
0 2~ n 


Moreover, by explicitly formulating the Schur complement [5J Section 3.2.11] we have 
that 


Kn+i(Hn+i, ei) 1 


K N {H N ,e i)- 1 -2 N K N (H N ,e 1 )~ 1 H%e 1 

0 2 n 


( 2 . 11 ) 


Now let pn( A) be the characteristic polynomial of Hn, i.e., pat(A) = det(A I — Hn). 
By expanding the determinant of XI — Hn for the last row, we find that 

Piv(A) = Apvr_i(A) — -Pat— 2 (A). (2-12) 

Now let pvr(A) := 2 N ~ 1 ppf(X) which satisfies the recursion pat(A) = 2Ap)v-i(A) — 
Pn- 2 (A). This is exactly the recursion of the Chebyshev polynomials. We have 
Pi (A) = A = Ti(A) and p^( A) = (2A 2 — 1) = ? 2 (A). Hence, by induction starting with 
N = 1 and TV = 2 it follows that pat(A) = T/v(A), for all TV > 1. Note that p 0 ^ T 0 . 
The Cayley-Hamilton theorem implies that 0 = pn(Hn) = Pn(Hn) = Tjv(TJjv) and 
in particular 0 = 2p(TVjv)ei, i.e., 


N -1 

- 2 N H% ei = ^ N ,iHhei. (2.13) 

2=0 

The first N rows of the last column of (12. lip can now be expressed as 


-2 N K N {H Nl e 1 )- 1 H^e 1 


2K N (H N ,e 1 )~ 1 


( N—l 


J2 T Nti W N e i 


. »=o 


= 2 


Tn, o 
Tn,n-i 


The structure in (12.101) for columns k = 2,..., TV follows by induction. The first 
column can be verfied directly by noting that Ki(Hi, ei) = 1 = Xo.o- 

The proof for the case (b) goes analogously. From (12.61) we see that in this case 
the characteristic polynomial piv (A) of Hn satisfies the recursion 

Pn(X) = Apjv-i(A) + -pat— 2 (A). (2-14) 

Defining p)v(A) = 2 n ~ 1 pn{X) and writing out the recursions we find (similarly to the 
case (c)) that pat(A) = (—i) JV TAr(iA). Comparing (12.121) and (12.141) we see that p)v(A) 
is of the form p)v(A) = \Tn,i\ X e . The claim follows from this. □ 
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Remark 4 (Combining with formulas for the Chebyshev polynomials) The 

coefficients T k ^ are given by the explicit expression (see CD pp. 775]) 


T k {x) = Y j {-lf 


k(k l l)! n fc-2^-i k-2i 

t\{k- 2£)! 


(2.15) 


Thus, when the basis functions <f>t are the modified Bessel functions of the first kind, 
the coefficients of the expansion m are explicitly given by 


w k 


lij 

Ei- 1 )' 

/=o 


k(k t 1)! ofc—2^—1 Ak-1i) 

£!(k- 2i)\ " 9 


e=o 




(2.16) 


where in the last step we have used the Cauchy integral formula. For the expansions 
with the Bessel functions of the first kind we get exactly the same formula, with (— l) e 
replaced by 1 in the summand, which is also given in |241 Sec. 9.1]. 

3. Infinite Arnoldi exponential integrator for (11.111 . Consider for the mo¬ 
ment a linear (finite-dimensional) homogeneous ODE 


y'(t) = By{t), 2/(0) = b 


(3.1) 


where y(t) £ C", with the solution given by the matrix exponential y(t) = exp (tB)b. 
Algorithms for m based on Krylov methods are typically constructed as follows, see 
m and references therein for further details. By carrying out N steps of the Arnoldi 
process for B and b we obtain the Hessenberg matrix Fn and the orthonormal matrix 
Qn+i £ C nx ( N+1 ' 1 that satisfy the so called Arnoldi relation 

BQn = QnFn + fN-\-i,N Qn+i^n , ( 3 . 2 ) 

where q* denotes the zth column of Qn+i = [gi,..., Qat+i] = [QniQn+i] and fij 
the i,j element of F/v, and q± = b/(3 with /3 := ||6||. The columns of Qn form an 
orthogonal basis of the Krylov subspace 

)C]sr(B, b) = span(6, Bb ,..., B N ~ 1 b). 

As a consequence of m , the Hessenberg matrix Fjy is the projection of B onto the 
Krylov subspace K.N{B,b), i.e., Fiy = Q* n BQn- 

The Krylov approximation of (E3D is subsequently given by 


y(t) = exp (tB)b w Q N exp (tF N )eij3. (3.3) 

Krylov approximations of the matrix exponential has for instance been used in mm 
[23]. 

The first justification of the proposed algorithm is based on applying a Krylov 
approximation analogous to (13.311 for the infinite-dimensional homogeneous ODE given 
in Lemma [l] Although this construction is infinite-dimensional, it turns out that due 
to the structure of Aoo and the starting vector b = [uj, eJ] T £ C°°, the basis matrix 
Qn lias a particular structure which can be exploited. 







Lemma 5 (Basis matrix structure) Let Qn £ C°° xAr be the matrix generated by 
the Arnoldi method applied to the infinite matrix A^ given by (11.61) and the starting 
vector b = [uj, ef] T £ C°°. Let qij £ C" +1 , for j = 1... TV, be the first n + 1 rows of 
Qn and let q^j £ C, i = 2,..., j = 2,..., N correspond to the rows n + 2, n + 3, .... 
Then, the basis matrix Qn has the block-triangular structure 


Qn 


9i,i 9 i ,2 
0 92,2 

: 0 


91 , N 

92 , N 


0 qN,N 


£ C°° xAr 



(3.4) 


Proof. The proof can be done by induction. For N = 1 the statement is trivial. If 
we assume Qn has the structure (13.41) . at step N the Arnoldi method will generate a 
new vector 9 : ,at+i £ C°° which is a linear combination of Aooq-.,N and the columns of 
Qn- Due to the fact that is a Hessenberg matrix, and A 00 has the structure ed, 
Aooq-.,N will have one more non-zero element than 9 : ,jv. This completes the proof. □ 
The zero-structure in the basis matrix Qn revealed in Lemma [5) suggests that 
we can implement the Arnoldi method for (12.21) by only storing the non-zero part of 
Qn- By noting that the orthogonalization also preserves the basis matrix structure, 
we can derive an algorithm where in every step the basis matrix is expanded by a 
column and a row. We note that the infinite Arnoldi method for nonlinear eigen¬ 
value problems has a similar property na Section 5.1]. The proposed algorithm is 
specified in Algorithm [1] As is common for the Arnoldi method, in Step [7] we used 
reorthogonalization if necessary. 

Another natural procedure to compute a solution to ED would be to truncate 
the matrix and thereby Aoo such that we obtain a linear finite-dimensional ODE 

v'{t) = A m v(t), 6(0) = 


M 0 

ei 


using ED and subsequently applying the standard Krylov approximation (13.31) on 
this finite-dimensional ODE. It turns out that this approach will provide an algorithm 
equivalent to Algorithm [I] if the truncation parameter is chosen larger or equal to 
the number of Arnoldi steps. Hence, in addition to the fact that Algorithm [T] can be 
interpreted as an infinite-dimensional Krylov approximation of (12.21) , the algorithm is 
also equivalent to the finite-dimensional Krylov approximation corresponding to the 
truncated matrix, if the truncation parameter is chosen larger than the number of 
steps. 

Lemma 6 Consider N steps of the Arnoldi method applied to A m £ c(" +m ) x (”+ m ) 
with starting vector b = £ C n+m . Let UN,m be the corresponding Krylov 

approximation, i.e., UN, m '■= [in 0] Qn expftF^eifi. Then, for any m > N, we 
have u= UN,m- 

Proof This follows directly from the zero-structure of the basis matrix in (13.41) . 
which holds also for finite to, when to > N.Q 
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Algorithm 1: The infinite Arnoldi exponential integrator for (11.11) 


Input : uq £ C n , t £ R, wo, w\... £ C n 
output: The approximation u ~ u(t) 


1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11 

12 


Let /3 = ||u 0 ||, Qi = uo/P, E .0 =empty matrix 
for k = 1, 2,..., N do 

Let qt = Q(:, k ) £ C n+k ~ 1 

Compute w := Akqk 

Let Q k be Qk with one zero row added 

Compute h = Q* k w 

Compute w± := w — Q k h 

Repeat Step [5][6] if necessary 

Compute a = ||wx|| 


Let F k = 



h 

a 


Let Qfc+i • [Q k i wj_ /o] 


end 

Let Fn £ 'R NxN be the leading submatrix of F N £ K.( jv + 1 ) xiv 
Compute the approximation = [l n 0] <2^ exp(ti 7 V)ei/3 


4. Convergence analysis. We saw in Lemma [G] that although the Algorithm [T| 
is derived from Arnoldi’s method applied to an infinite-dimensional operator Aoo, the 
result of N steps of the algorithm can also be interpreted as Arnoldi’s method applied 
to the truncated matrix A rn for any m < N. In order to study the convergence we 
will set m = N and use the exact solution associated with the truncated matrix Ajv, 
denoted by u^(t). More precisely, 

u N {t) := [l n 0] exp(tA N )u N - (4.1) 

By trivial subtraction and triangle inequality we have that the error is bounded by 

III# - u(t )II < I \u(t) - u N (t)\\ + \\u N (t) - u^W- (4.2) 

The first term ||u(f) — '(tjv(t)|| can be interpreted as an error associated with An (the 
truncation of A^) and is not related to Arnoldi’s method, whereas the second term 
11 un (i) ^ ^#11 = | |txat(^) — WAr,./v|| can be seen as an error associated with the Arnoldi 
approximation of the matrix exponential. The following two subsections are devoted 
to the characterization of these two errors. 

4.1. Bound for the truncation error. It will turn out that the truncation 
error (first term in (14.21) ) can be analyzed by relating it to exp(tiLv)ei, i.e., a vector 
of functions generated by the truncated Hessenberg matrix Hn- Let £jv denote the 
difference between the basis functions and the functions generated by the Hessenberg 
matrix, 


e N {t) := - exp(L£Lv)ei, 


(/) N {t) := 


( (f)o(t) 


(4.3) 
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The following lemma shows that a sufficient condition for the convergence of the first 
term in (14.21) is that ||H' r jv£jv(s)|| —> 0. The following subsections are devoted to the 
analysis of ||Wjv£iv(s)|| for different basis functions, and in particular lead up the 
convergence of the truncation error given under general conditions in Theorem [8] and 
Theorem |TT] 


Lemma 7 Let u be the solution to the ODE (HD and it at be defined as in S3. 
Suppose the expansion S3 is uniformly convergent with respect to s. Then, 


IK*)-ujv(i)|| < J ||e (t-s)j4 ||||< 7 (s) — Wjve sHjv ei|| ds < 

0 

t 

[ ||e ( *~ s)A || ds f max ||#(s) - W^v0iv(>)|| + max 11(s)11. (4.4) 

J \«e[°,t] se[0,t\ ) 

o 

Moreover, for every s < t we have 

||<;(s) — Wjv^jv(s)|| —> 0 as N —> oo. (4-5) 

Proof. The first bound in (14.41) follows from the variation-of-constants formula 

t 

u(t) = e tA uo + J e < - t ~ s ^ A g(s) ds, which gives the exact solution for the ODE (11.11) . 
o 

and from the representation [10[ pp. 248] 

UN = [In o] e tAN U N = [In 0] 


e tA f e (t- s )A W]ve sH N ds 
0 

0 e tHN 


Un 


l 

= e tA u 0 + J e it ~ s)A W N e sHN e 1 ds. 


The second inequality follows by adding and subtracting 0 jv(s). The limit expression 
(14.51) follows from the fact that g(s) — Wn4>n(s) = YjiLn Wt^( s ), which is the 
remainder term in the expansion m- The limit vanishes due to the fact that m 
is uniformly convergent. □ 

4.1.1. Bounds on ||Wjv£jv||: scaled monomial basis. Suppose (f>t are scaled 
monomials such that Hn is a transposed Jordan matrix, i.e., the truncation of » 
The definition of £n yields £^(0) = <£$( 0) — H^e i- It follows from the structure 
(11.41) that, for k < N, H^e i = e*, and H^e i = 0 if k > N. Moreover, 0^(0) = 
e k4>k\d) = e* if k < N and = 0 if k > N. Hence, £^(0) = 0 for all k. Since 

en is analytic and all derivatives vanish, ejv(t) = 0. Hence, we have obtained the 
following result. 


Theorem 8 Suppose f>N are the scaled monomials, given by (f>e(t) := t^/I\, and Hn £ 
S. NxN is the leading submatrix of (11.41) . Then, 

£j v(t) = 0 

where £n{I) is given by (14.31) . Consequently, if the basis functions are the scaled 
monomials, the truncation error ||u(t) — ujv(£)|| —t 0 independent oft. 
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4.1.2. Bounds on ||Wj\r£jv|| : Bessel basis functions. If the basis functions 
are Bessel functions or modified Bessel functions of the first kind, further analysis 
is required to show that ||Wjv£iv|| vanishes. The elements of £n, denoted by £N,k, 
k = 0,..., N — 1, can be bounded as follows. 


Lemma 9 Let £ R NxN be defined as either E3D or m - Then, the vector £jv 
satisfies 


ejv(i) = f Jiy(s)eis[ ds 

Jo 


(4.6) 


and is bounded as follows, 
(a) For all t> 0, 


lkiv(i)|| < 


(§*) 


N 


(N + 1)! 




(4.7) 


(b) Suppose t > 2. Then there exists a constant C{t) depending only on t such 
that for 1 < k < N, 


|£iv,fe(i)| < C{t) 


t k (N — k)\ 
2 2N ~ k (2N -jfc + l)!' 


(4.8) 


Proof. Proof of (14.61) : From the properties (12.31) we see that Jjv(t) satisfies the 
initial value problem J' N (t) = H^J^it) + JM{t)e jv, J(0) = e\. Therefore, the error 
£i v(t) := J N (t) — e tHN e i satisfies the initial value problem e' N (t) = Hn^n + Jn^^n, 
ejv(0) = 0, for which the solution is given by (14.61) . 

The statement 62D follows from properties of Hn and Bessel functions as follows. 
From Lemma H7l we have that ||e^ s ') Hn \\ < \f2e l s and therefore 


IIM*)|| < 



ML 

(JV + 1)! 


y/2 te l , 


\—t\ 

where in the last step we used that for t > 0, |</f>Ar(f)| < 1 2 N \ e* if </>jv = Jn or 
4>n = In, which is a consequence of the formula [2U pp. 49], 

\Jn(z)\ < (4.9) 

n\ 

It remains to show (|4.8|) . We first note that, gp and Lemma [18] with R = t 2 
implies that 


\£N,k(t) | 


J eje^ s ' ,Hn CnJn(s) ds 

o 


< 


e t C(t 2 ) 

f2(N—k) ]\[\ <22N — k 


J s N (t-s) N ~ k ds, 
0 


(4.10) 

where C(t 2 ) is given by (IA.4I) . We identify the integral on right-hand side of (14.101) 
as a scaled Beta function t m+n+1 B(m + 1 ,n+ 1). The conclusion (14.81) now follows 
from the application of a formula for B(m + 1, n + 1) in [TJ pp. 258]. More precisely, 

e‘C(f 2 ) t 2N ~ k+1 N\(N — k)\ t 2 f fe+1 (N-k)\ 

m,kW\- t 2( N -k) m2 2N-k (2N — fc + 1)! [ ’2 2N -k(2N-k + 1)!' 
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□ 

We have now derived a bound on e N and shown that ||e r jv(t)|| — > 0 as N —> oo, 
when the basis functions are the Bessel functions or the modified Bessel functions of 
the first kind. Note that this does not necessarily imply that ||Wjv£jv(£)|| —> 0, since 
the coefficient matrix Wjv may not be bounded for all N. Fortunately, the analyticity 
of g(t) gives us a bound on the growth of the coefficients Wk- 

Lemma 10 Suppose g is analytic in a neighborhood of a disc of radius t centered at 
the origin. Let M t be defined as 


M t = max ||g(A)||. 

|A|=i 


(4.11) 


Let the vectors Wk be the coefficients of the expansion (11.21) of gift), where the functions 
are the Bessel or the modified Bessel functions of the first kind. Then, for 0 < t < 2 
we have the bound 


||i«fc|| < M t k\ for all k>0. 

Moreover, for t > 2, we have the bounds 

||w4c|| < M t k\ for all k > 0, 

and 

ft\ 2 

for all k > f — J +1. 


\w k \\ < M t k \2 ( - 


(4.12) 


(4.13) 


(4.14) 


Proof. The closed form (12.161) for the coefficients Wk implies that for all k > 0 

LU 


Ikfcll < 


k{k-l- 1)! fc _ 2£ _ lM 1 /' g{ A) 




t\ 


2 - k \ / A fe_2 ^ +1 


dA||. 


(4.15) 


Combining (14.111) and (14.151) gives us 

t UJ 

2 


.. M t ^4 k(k-£- 1)! (2 

w k ||< —2^- 

£=0 


l\ 


k-If. 


(4.16) 


Thus, when t < 2, we have that for all k > 0 


wfe < 


M t 2 


2 \ t 


^2 k(k-e-l)\ < M t [2 




t\ 


2k\ 


which gives (14.121) . When t > 2, (|) fc < 1 if 0 < £ < [-|J, and with a similar 

reasoning (14.131) follows from (14.161) . 

In order to show 1)4.141) , we note that (14.161) can be expressed as 


I II < Mt V 
Ml < 


(4.17) 


o 
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where cq = k\ (|) fc and ci = (|) 2 • Q_i if £ > 1. When 1 < l < [|J, 

(fc — £)£ > k — 1, and we see that satisfies q < a^Q-i such that ci < a^co, where 

afc = t 2 /4(fc — 1). The conclusion (14.141) follows from (14.171) and the fact that the 
assumption k > 2 (|) + 1 implies that a k < 1/2 and by taking the limit £ —> oo. □ 
By combining the bound of £jv and the bound of Wf, we arrive at the following 
result for Wn£n(£)- 


Theorem 11 IfWw corresponds to the expansion of g(t) with the Bessel functions 
or the modified Bessel functions of the first kind, then for all t > 0 

||kFjv£Ar(t)|| —> 0, as N — >■ oo. (4-18) 


Consequently, if the basis functions are the Bessel functions or the modified Bessel 
functions, the truncation error ||w(t) — rtAr(f)|| — > 0 independent oft. 

Proof. Consider first the case 0 < t < 2. By 63) and (|4.12D we see that 


N 


N 


N 


|^^£jv,^(t)|| < ||ejv(i)|| XllMI - W £ N(t)\\J2 Mtk ' 7 


e=o 


< 


(I) 


N 


k =0 
N 


k—0 


(N+ 1) 
= V2tM t 


^y/2te t M t (jj (JV! + (N - 1)! + ... + l) 


1 


1 


s N+l (N+1)N 

Consider the case t > 2. Suppose N > k := 


1_) < VltM, (2^) . 


(N 

2(|) 2 + 1 


. We see that 


N 


k 


N 


^2we£ N A t )\\<^2\\ w i\\\ £ N,e{t)\+ IMI l £ iv/(t)| ■ 

e=o e=o t=k+i 


(4.19) 


We now show that both of the terms in the right-hand side of (14.191) vanish as N —> 00 . 
Using the bound (14.131) of Lemma [lOl and the bound (14.81) of Lemma [5] with k = £, 
we see that there exists a constant C%{i) := M t C(t)t k , which are independent of N, 
such that 


DM) I^WI < ftd) t. _/' + !), ■ (1-20) 

Since for £~<k<N, t\(N-£)\ < (£ + 1)\(N-£)\ < (AT+1)! and 2 2N ~ e (2N -£+l)\ > 
2 n ((N — k) + N + 1)!, we see that 

(£+1 )\(N-i)\ < (^ + 1 )! < 1 (4 21) 

2 2N ~\2N -£+l )! “ 2 n ((N -k)+N + l)\ ~ 2 N N N ~ k ' 

By inserting (14.211) into (14.201) we conclude that the first term in (I4.19p vanishes as 
N —> 00 . 
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For the second term of (14.191) , we use the bound (14.141) of Lemma [TT)1 and the 
bound (14.81) . to see that there exists a constant C 3 (t) := M t C(t), which are indepen¬ 
dent of N, such that 


N N / 9 \ 

Y IMI < c 3 (t) Y (jJ 


A 4t 


e.=k +i 


e=k+i 

N 


(2N-1+1)1 


<C 3 (t ) £ 

e=k+i 

<C 3 {t)({N-k- 1) 


t e £\(N — t)\ 

2 2N ~ e (2 N -l+l)\ 

N\ 


1 


(IV+ 2)(IV+ 1) N+lJ ’ 


This implies that the second term in the right-hand side of (14.191) converges to zero 
as IV —> oo and completes the proof. □ 

4.2. Error bounds for the Arnoldi approximation. In order to show con¬ 
vergence of (14.21) we will now study the second term in (14.21) . Let 


Qn 


Qi,N+1 

Q2,N+1 


£ C(n+JV+l)x(lV-|-l) 


where Qi,jv £ C" xJV is the orthonormal matrix and F N = Q* N A N Q N the Hessenberg 
matrix given by the infinite Arnoldi algorithm after N iterations. The Arnoldi relation 
& with B = A n , implies that 


AQi'N + WQ2 ,n = Qi,nF n + fN+l,Nqi,N+ieJf 

HnQ2,N = Q2,nFn + fN+l,Nq2,N+ieJf- 


(4.22) 


The polynomial approximation property of the Arnoldi method m Lemma 3.1] states 
that for any polynomial p of degree less than N we have p{An)un = f3QNp(F m )e\. 
In our situation we can exploit the structure of Ajv when we select p{z) = z e . From 
the second block of p(A n )un we conclude that H l N e i/3 -1 = <92,JV-Pjv e i f° r all I — 
N — 1. By stacking this equation as columns into a matrix equation we find that 
I\ N (F[ N ,e 1 )/3~ 1 = Q 2 ,nK N {F N , ei), such that 

Q 2 ,n = Kn(Hn , ei )Kn{Fn, ei) _1 /3 _1 , (4.23) 

where Km denotes the Krylov matrix, defined in (12.71) . The orthonormality of Qn 
implies that ||Qat|| = 1, from which it follows that ||Qi,jv|| < 1 and ||< 52 ,jv|| < 1- 
Consider An of the form m for a general Hessenberg matrix Hn- The infinite 
Arnoldi approximation at step N is given by 

[In 0] Qjv exp(tFjv)ei^ = Qi !A r exp(tFjv)ei/3, (4.24) 

where /3 = | |mat| | ■ We again use the polynomial approximation property, which implies 
that li^N = QnYIiY Ji^nP- Hence, the second term in the error (14.21) can 

be expressed as 

u N {t) - u^it) = [I n 0] (exp(tA N )u N - Qn exp(tF N )eiP) = a N + b N , (4.25) 
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where 


dN '■= [In 0 ] rpf(tA N )u]y 
In '■= —Qi,NfN(tF N )eiP 


(4.26) 

(4.27) 


and tn denotes the remainder term in the truncated Taylor expansion. We will use 
an explicit representation of rjv, 




( z ) = J2 t = zN tn( z ), 


Z=N 


with the standard definition of (^-functions, 


<Pe(z) ■■= E 


OO b. 

z K 


k =0 


{k + ey. 


= / e 


(l-r)2 


■-I)!’ 


(4.28) 


(4.29) 


4.2.1. Convergence of ajv in (14.2511 . The analysis of (14.251) is separated into 
analysis of qn and 6jy We first need a reformulation of ajv- 


Lemma 12 Let An, un, Tn and ipn be defined as above. Then, the following expres¬ 
sion holds 


n 


a N = (tA) N <p N (tA)u 0 + F t e (tA) N e ip N (tA)g u x) (0) 


i =l 


+ Y t e MtA)W N HYei. 

e=N+i 

Proof. By induction it is readily verified from (11.61) that 


(4.30) 


A 


N 


uo 


A k u 0 + E A k ~ l W N H l Ye i 

el 


i=i 

H%e x 


From this it follows that 


[In 0] r N (tA N ) 
f ( tA) k 

= Y —n— u o 


u o 
ei 

oo k 


= [ln 0]£ 


k=N 

\k-£ 


(' tA N ) k 
k\ 


k=N 


k\ 


E Y.*‘ { Yr- WNH ‘A e ' 


= E 


k=N £=1 
N oo 


<“>*- ,±f:PAAlw N H‘ N -^ + t 


k—N 


k\ 


-UQ 


t=l k=N 


e=N+i k=t 


k\ 


WnHYz i. 


(4.31) 

Since = ff (<?_1) (0) when 0 < £ < N, we find for the second term on the 

last line of (14.311) that 


No o /, N 

E E s^wnh^ = E t^tAf-^N^g^iO). 

1=1k=N ' 1=1 
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For the third term on the last line of (14.311) we see that 


OO OO /, A \ k — g oo 

E 1= E SMtAWNH^e 1} 

£=JV+1 fc=£ ' £=jV+l 

from which the claim follows. □ 

We are now ready to state convergence of the first term in (14.25[) under general 
assumptions about the nonlinearity g. 

Theorem 13 Let An be defined as in m- Assume that for the vectors 0) are 
bounded by 

|| 5 W(0)||<C||A||* (4.32) 

for some constant c € R. Then, on defined by (14.2611 satisfies 

ajv —► 0 as TV —)• oo. 

Proof. We bound the norm of the term (14.301) as 


N 


[In 0] r N (tA N )u N \\ 2 < \\(tA) N ip N (tA)u 0 \\ + || ^ t l {tA) N e ip N (tA)g (e 


i=i 


+ || t e ipe(tA)WNH e N 1 ei||. 

e=N+i 

By Lemma ITTJ1 we get a bound for the first term in (14.331) as 

... N , NtAH^ max(l, e M ^) 

\\(tA) N <p N (tA)u 0 \\ < H—ll- -A-L -1|KI|. 


(4.33) 


TV! 


For the second term in (14.331) . we see that 


N 


N 


ix:* € (^)^w(tA)^- i) ( 0 )ii < x;t«iitAir-^ii^- i) (o)iiii W (tA)ii 


t=i 


t =i 


< g^||M|r mm(1 '° , ‘ M>) = 0||iA||~ max(1 ' e " < ‘ A>) 


£=1 


(TV-1)! ’ 


where C = C/||T^4||. Thus, also the second term in (14.331) converges to zero as TV —> oo. 
For the third term in (14.331) , we use Lemmas [T9] and [20] to find that 


X] SMtAWNH^dW < J2 ^||w(^)||||G J v||||iViv(ffjv,e 1 )- 1 J?^- 1 e 1 | 


e=N +1 


e=N+i 


< 


f; t » max( i^ >,A) ) | | Gwl|2 w (1+ v2 )J v 


e=N+i 


= t N+1 ifiN+i{t) max(l,e /i(tA) )2v / TV(l + V / 2) Ar ||GAr|| 
< 


i^+V max(l,e^ tA ))2\/TV(l + v^HG^He* 


(TV + 1)! 
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By assumption (14.321) 


iMir-i 

IMII 2 -1 ' 



TV—1 

tv— i r 

||Gjv|| < ||Gjv||f = a 

Ell^- 1) (0)ll 2 <c' 

J2\\ A \\ 2e = c \ 

\ 

e=o \ 

e=o V 


Thus also the third term in (14.33[) converges to zero as N —> oo. □ 

4.2.2. Convergence of in (14.251) . Bounding the remainder QpjvbwV-FTOei/? 
in the error expression (14.251) needs in general additional assumptions about Fn. 
Before stating the convergence theorem, we need the following lemma. 

Lemma 14 Assume that 1 < ||-ff/v|| < ||A|| and that (14.321) is satisfied for some 
constant c > 0. Then, for 0 < £ < N 

Ni^erll <(1 + ^)1!^. 

Proof. From cm, cm and (14.231) we see that the Hessenberg matrix Fn is given 
by 

Fn = Q*nA N Qn = Ql jN AQi t N + Q2 ,nHnQ2,N + Q*i,nWnQ2,N 

= Q\ iN AQ 1N + Q* 2 ^ n H n Q 2 ,n + {3- 1 Qi,nGnR-n(F n , ei) _1 , 

where Gjy = [<?(0) g'(0) ... g( Ar ^ 1 )(0)]. Thus, for the norms of the products 

F^ei, 1 < £ < N, we get the following recursion: 

11 -Pjv e i 11 — II {Qi,nAQi,n + Q 2 ,n-^nQ 2 ,n) F n ei|| 

+ /3 -1 || Qi,nGnKn(F n , ei)^ 1 F^ r _1 ei|| 

< max(||A||, ||i?v||)||^- 1 e 1 || +r 1 ||OtW- 1) (0)|| 
KWAwwFfr'dW+p-'cWAw*- 1 , 

since (3 > 1 and Kn(Fn , ei)~ 1 F^~ 1 ei = e£ for 1 < £ < N. By induction we have that 

\\F%eiW < WAW'UFOe^+p-'ceUAW*- 1 = \\A\^ + ^c£\\A\\^ < \\A\\\l+fi^c£). 


We are ready to give the following result, which gives sufficient conditions for the 
convergence of the Arnoldi error. 


Theorem 15 (Arnoldi error) Suppose there exists a constant c > 0 such that 
(14.321) is satisfied. Suppose Algorithm.^ generates a Hessenberg matrix F'n such that 
for some constant C, ||F^|| < C N for all N > 0. Then, b^ given by (14.271) satisfies 

||&zv|| —> 0 as N —> oo. 

Moreover, the Arnoldi error in (14.251) satisfies 

||wjv(t) — u™(i)|| —> 0 as N —> oo. 

Proof. We see from (14.281) that 


rN(tF N )e i 


OO 


E 


(tF N Yei 


k= 1 
18 


(yt (tF N y e i \ 
^ {kN + i)[ )- 


£\ 


(4.34) 






























We see by Lemma^lQ that 

^ (tF N y ei < y 1 (1 + cl)\\tA\\t , ^ \\tA\\* 

+ (kN + m ~ 1 ’^(kN + ey. 

£=o v ' t= o v ’ £—0 v ' (4.35) 

p t|MII 

< (1 + p- l cN)y kN {t\\A\\) < (1 + /T 1 cJV)-p^-. 

In the last inequality above we use Lemma [03 Thus, we see from (14.341) and (14.351) 


that 


IIMI < ||0.v||||r i v(^)e 1 ||/3 < £ (||(W JV || fc || £ |^^ll) P 


k= 1 


£=0 


<(/3 + cA)]T 


e t\\A\\ 


k =1 


(kN)\ PS{P + ci V e 2_^ {kN)[ P 


which converges to zero as N —> oo. □ 


Remark 16 (Assumptions in Theorem 1151) Theorem IT5l is only applicable when 
there exists a constant C such that H-F^H < C N for all N > 0, where Fn is the 
Hessenberg matrix generated by Algorithm[T| This is a restriction on the generality of 
our convergence theory. In our numerical experiments we have seen no indication that 
the assumption should not be satisfied (see Figure l5.2cll . Moreover, the assumption 
can be motivated by certain intuitive uniformity assumptions and the generic behavior 
of Arnoldi’s method for eigenvalue problems, as follows. From the definition of the 
spectral radius, we have 


\\F e N \\ 1/e ^ p(F n ) asf^oo. (4.36) 

Moreover, under the condition that the Arnoldi method approximates the largest 
eigenvalue of A^, we also have 

p(Fjv) —t p^Aaf) as N —> oo. (4.37) 

The operator p(Aoo) is block diagonal and the (l,l)-block is a finite operator A and 
the (2,2)-block is a bounded operator (by assumption (ll.3bl) l. Hence, it is natural 
to assume that p(A oa ) = d £ I exists. If p(A 00 ) exists and the limits (14.361) and 
(14.371) hold also in a uniform sense, we have that H-F/vll 1 ^ ~t di which implies the 
assumption. 

5. Numerical examples. 

5.1. Numerical evaluation of the derivatives g^\ 0). In order to carry out 
N steps of the algorithm, we need the expansion coefficients w o,... ,Wjv, which are 
directly available from the derivatives (/^(0), i = 0, ..., N via (12.91) . If the nonlinear¬ 
ity is not explicit such it is not possible to compute expressions for the derivatives by 
hand, there are several alternatives. One may use, e.g., symbolic differentiation which 
is available for several special functions in MATLAB. or the techniques of automatic 
differentiation can be used j9]. 

Another alternative is to use matrix functions. If an efficient and numerically 
stable matrix function implementation of h(z) is available (see, e.g., [TUI Ch. 4]), one 
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Error at time t 



Size of the expansion N 

(a) 




(b) (c) 


Fig. 5.1: Subfigure (a) shows the absolute error vs. expansion size N for the approx¬ 
imation of f{t) = sin 2 (f) for the three different choices of basis functions. Subfigure 
(b) and (c) shows the error vs. the Krylov subspace size for the Schrodinger example, 
(b) e = 1CT 5 , T = 10, (c) e = 10“ 3 , T = 0.5. 


may use the fact that 



r mo) 


'0 

1 0 


h'iO) 


hiH) ei = 

h"i0)/2 

for H = 


_/i (JV_1) (0)/A!_ 


1 o. 


Also, there exists methods to compute derivatives by numerically integrating the 
contour integral in the Cauchy integral formula [4]. 

5.2. 1-D Schrodinger equation with inhomogeneity. We first consider a 
finite difference spatial discretization (with 100 points) of the initial value problem 

id t u = — ed xx u + fit) sin(2 4 7ra;(l — x)), a; G [0,1], fe[0,T] (5.1) 

subject to periodic boundary conditions, with /(f) = (1 + i)sin(t) 2 and initial con¬ 
dition uix, 0) = exp(—100(x — 0.5) 2 ). Figure PoTTTil depicts the absolute error of the 

iv-i 

approximation/(t) « Wjv exp(ti?jv)ei = w e < t > e, (t), for the three different choices 

of Wn and Hjf, for 1 < N < 50 and t = 6. We again compare the infinite Arnoldi al¬ 
gorithm approximation of u(T) for the three different expansion of fit). In Figures [5. II 
we illustrate the relative 2-norm error of the approximations vs. the Krylov subspace 
size, when e = 10 -5 and e = 10~ 3 . In Fig. 15. la! we observe different truncation errors 
for different basis functions. Analogously, a difference in convergence speed of Alg. [T| 
can be observed in Fig. I5.1bl 

In a sense, the convergence of the linear part (associated with A) dominates the 
total error in the case of the strong linear part (e = 10~ 3 ), and therefore the choice 
of basis does not affect the convergence, which is also observed in Fig. 15.id 

We illustrate the competitiveness of the approach in terms of CPU-timt0 in Fig¬ 
ures [5?2l when e = 10 -5 and e = 10~ 3 . We use three different integrators: the infinite 

J A11 experiments are carried out on a desktop computer with a 2.90 GHz single Pentium processor 
using MATLAB. 
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relative 2-norm error 


Arnoldi algorithm with the Bessel functions of the first kind and the MATLAB im¬ 
plementations of the Runge-Kutta method ode45 and odel5s. 

Note that the Matlab integrators use adaptive time-stepping, and that the infinite 
Arnoldi method performs a single time step for which the subspace size is set a priori. 
When e = 10 -5 , ode45 needed 10,16,25,40,86 time steps to obtain the results of 
Figure Hu2l and odel5s 10,13,51,96,189, respectively. When e = 10~ 3 , ode45 needed 
29,30,31,33,33 time steps, and odel5s 10,15,22,60,124 time steps. 

When the linear part is not very stiff, we see that the explicit integrator ode45 
gives better results than the stiff implicit solver ode23. For this particular simula¬ 
tion setup, the infinite Arnoldi method is faster than the MATLAB Runge-Kutta 
implementations, as can be observed in Figures 15.21 

Figure I5.2cl gives a numerical justification for the assumptions used in the error 
analysis given in Section 14.2.21 We consider the numerical example above with the 
parameter e = 10 -3 . We observe that up to machine precision, ||-Fjv || 1//JV —> p{A) as 
N —► oo, such that the conditions discussed in Remark [16l appear to be satisfied. 
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Fig. 5.2: Subfigures (a) and (b) show the error vs. CPU time in seconds for the 
1-D Schrodinger example, with (a) e = 10 -5 , T = 10, and (b) e = 10 -3 , T = 0.5. 
Subfigure (c) show the indicator \\\F$ || 1 / JV /p(A) — l| 


5.3. 2-D Schrodinger equation with inhomogeneity. In order to illustrate 
generality of the infinite Arnoldi method, we consider a finite difference spatial dis¬ 
cretization (with 100 2 points) of the two-dimensional initial value problem 

id t u = -e(d xx u + d yy u) +f(t)sm(2 4 nx(l-x)y(l-y)), xG[0,l], te[0,T] (5.2) 

subject to periodic boundary conditions, with f(t) as in (15.111 and initial condition 
u(x, 0) = exp(—100((x — 0.5) 2 + (y — 0.5) 2 )) 

We compare the infinite Arnoldi algorithm approximation of u(T ) for the three 
different expansion of fit). Figures [5.31 depict the relative 2-norm error of the ap¬ 
proximations vs. the Krylov subspace size, when e = 5 • 10 -3 and e = 5 • 1CU 2 . We 
see again that the convergence of the linear part starts to dominate the total error as 
the linear part gets larger. 

Figures 15.41 depict the relative 2-norm errors of the approximations of u(t) vs. 
the CPU time when e = 5 • 10 -2 and e = 5 • 10 3 , for the three different integrators: 
infinite Arnoldi with Bessel expansion and Matlab codes ode45 and ode 15s. 
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Fig. 5.3: Error vs. the Krylov subspace size for the Schrodinger example. Left: 
e = 5 • 10~ 3 , T = 10, right: e = 5 • 10" 2 , T = 0.25. 




Fig. 5.4: Error vs. CPU time in seconds for the 2-D Schrodinger example. Left: 
e = 5 • 10~ 3 , T = 10, right: e = 5 • 10" 2 , T = 0.25. 


6. Concluding remarks and outlook. The main contribution of this paper 
is a new algorithm for inhomogeneous linear ODEs and the associated convergence 
theory. The algorithm belongs to a class of methods exponential integrators. Many of 
the techniques that are combined with exponential integrators are likely feasible in this 
situation. For instance, a potentially faster approach can be derived by repeating the 
algorithm for different f, i.e., instead of integrating to t = T directly, the algorithm 
can be applied for hi,..., h m where T = hi + •■■ + h m . Moreover, it seems also 
feasible to apply the algorithm to certain nonlinear equations, by simple linearization 
procedure, although it would certainly not be efficient for all nonlinear problems. See 
m for variants of exponential integrators. 
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Appendix A. Additional results needed in proofs. 

Lemma 17 Let Hn £ M iVxAr be defined as in fljgp or as in m, and let the eigen- 
decomposition of H n be given as Hn = V AV~ X . Then, the condition number of the 
eigenvector matrix in 2-norm, i.e., (V) = ||V P ||||V’ _1 ||, is given by k 2 (V) = \/2. 

Moreover, 

\\e tHN \\<V2e ta{HN) , (A.l) 

where a(A) denotes the spectral abscissa of A. If Hn is given by (12.41) . a(Hpj) = 0 
and if Hn is given by m we have that cv(Hn) < 1. 

Proof. We first consider the case where Hn is defined by (12.61) . Note that H N 
is the colleague matrix of the Chebyshev polynomial Tn(x) [22J Theorem 18.1], and 
we know that Hn has N different eigenpairs (X,v) where A-values are the zeros of 
Tn{x), and ^-vectors are of the form v = [T 0 (A) ... T n _i(A)] T . Thus, Hn has the 

eigendecomposition Hn = VAV -1 , where 

To(t.N-l) 

\ 1 

TN-litN-l)_ 


V = 


To{to) 

TN-l{to) 
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and (to, ..., £jv-i) are the N different zeros of T N (-). The Clrebyshev polynomials 
satisfy a discrete orthogonality condition 


Y T ^) T i(tk) = { N 

,N/2 


»;*3 
i = j = 0 
* = 3 + 0 . 


(A.2) 


With (IA.2I) we verify that VV* is a diagonal matrix where all elements are equal 
to N/2 except the first element which is equal to N. Hence, i?-matrix in the QR- 
decomposition of V* = QR is a diagonal matrix and we conclude that there exists 
Q 6 W nxn such that QQ* = Q*Q = I and V = adiag(\/2,1,..., 1 )Q*, where a = 
\JN/2. We we see that ||F|| = |a| \/2, and ||F _1 || = 1/ |a|. Therefore K 2 (V) = y/2. 

Let now Hm be defined as in m • Define the polynomials T n (x), n > 0 as 
T n [x) = i n T n (—ix), where T n is the nth Chebyshev polynomial. We now use the 
recurrence relation of Chebyshev polynomials; see, e.g., [22; Chapter 3]. We see that 
Ti satisfies Tq(x) = 1, T\{x) = x and T n+ \(x) = —2xT n (x) + T n _i{x). Therefore, 
the eigenvalues of Hn are the zeros of the polynomial T n (x) 1 which are i multiplied 
with the zeros of the polynomial T n (x). The corresponding eigenvectors are of the 


form 


W) 


T n _i(A) . From the condition (IA.2I) it follows that the 


polynomials Tj(x) satisfy the condition 


jv-i [0 , * t ^ 3 

Y Ti{t k )Tj{t k ) = < AT , i=j = 0 

fc=0 [i i+J N/2 , i=j + 0 

and the rest of the proof follows as for the modified Bessel functions. The bound 
(IA.ll) follows from the fact that ||e tffjv || = ||V r e tA C _1 || < n(V)||e tA ||. The conclusion 
about the spectral abscissa if Hm is given by (12.611 follows from Gershgorin’s theorem 
and the conclusion of if Hn is given by (12.41) follows from the fact that the eigenvalues 
of Hjy are imaginary. □ 

Lemma 18 Let Hn be defined either as (|2.4I) or (12.611 and let t > 0. Let R £ R be 
any value such that R> t. Then, the elements of e tHN are bounded as 

{e tH "). .<C(R) A'^'l, (A.3) 


where A = ^ and 


q R +Tr 

C{R) = max(||exp(tiJ 7 v)||, 2 v / 2 --—). (A.4) 

1 — A 

Proof. We may apply directly the bound (3.10) in [3] Sec. 3.7]. We know that 
tHjsi has its spectrum inside the interval [— t, t\, which has the logarithmic capacity 
p = t/2. For the integration contour we take the same ellipse as in [3J, so V = 2tt 
and M(R) = e R+ where R > t can be chosen freely. Let Hn = VDV~ 1 be the 
diagonalization of Hn- From Lemma flTl we know that k(V) = \[2. The bound (3.10) 
of 0 gives (TOll . □ 
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Lemma 19 For any matrix A £ C nxn 

\\MA)\\ < 


and, positive integer £, 
max(l, 

J\ ’ 


where n{A) denotes the logarithmic norm, i.e., n(A) = max{ A : A £ A( A+ r, A )}. 
Proof. From (14.2911 we see that 


11^)11= II je^ A jf^\\dt< 
0 



d t. 


Using the Dahlquist bound ||e' A || < eA A ) and the fact that /x((l —t)A) < max{0, p{A)} 
for 0 < t < 1, the claim follows. □ 


Lemma 20 Let Hn be defined as in <l2~4l) or (Infill . Then, for k>N, 
\\KN(H N ,ei)- 1 H l f 1 ei\\ < 2 Vn (1 + V 2 ) n . 


(A.5) 


Proof. Let pjv(A) = be the characteristic polynomial of Hn- Define 


CUN = ~ 

Ot 0 

, and C(cvn) = 

1 

CUN 


pN- 1 . 



1 


Suppose k > N. Since H^e i = — cteH^e\ = Kn{Hn, ei)a/v, we see that 

HnKn{H n , ei) = [H N ei ... H^ei] = K N (H N , ei)C(ajv), 
and since = Kn(Hn, ei)ejv, we see that 

H k N e r = H k N - N+1 K N (H N , ei )e N = K N (H N , ei )C(a N ) k ~ N+1 e N , 


i.e., 

K n (H n , e^H^ei = C(a N ) k ~ N+1 e N . (A.6) 


We recognize that C(oin) is the companion matrix of the TVth Chebyshev polynomial 
Tat, and that VnC{q.n) = AjvVjv, where Ai,...,Ajv are the zeroes of Tn, An = 
diag(Ai, ..., Ajv) and Vjv is the Vandermonde matrix corresponding to Ai,..., Ajv, 
i.e., 


Vn 


1 Ar 
1 Xn 


,N- 


1-| 


A 


JV-l 

N - 


Thus for £ > 1, Cfa^Y = V N l A 1 n Vn , and subsequently for any matrix norm || • ||* 
||C'(ajv)^||* < ||V^ 1 ||*||A f ||*||V} v ||* < ||U A ^ 1 ||*||U/v||* for all t > 1, (A.7) 

since |Aj| < 1 for all 1 < i < N. From [3 Thm. 4.3 and Example 6.2], we know that 

II^HoollUvIloo <2(1 + 72)^. (A.8) 

Thus, using (1A.6I) . we see that for k> N \\KN(HN,ei)~ 1 H l f / ei\\ < ||C(aAr) fc ~ Ar+1 || < 
V A /V : ||C'(ajv) fe_Ar+1 ||oo and the statement follows from (1A.7D and (1A.8I) . □ 
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